Field of the Invention
The present invention generally relates to a method and apparatus for analysis of speech, and more particularly to a method and apparatus for real-time speech analysis.
Description of the Related Art
Speech is an integral part of our daily lives. Accurate speech (e.g., in pronunciation and grammar) plays an important role in effective and efficient communication. Being able to speak effectively may allow one to be readily understood, to sound confident, and to get an important point across clearly.
Conventional devices and techniques for correcting and improving speech include both human instruction and computer-aided tools.
In a conventional human instruction approach, a human teacher (e.g., a speech-language trainer or a linguist) is employed to aid in the correction and improvement of speech. For example, one may attend an in-person workshop or complete an online class.
Use of a live teacher, however, can require a large amount of time. Furthermore, the cost is often very high. Additionally, such an approach lacks much-needed flexibility.
With a conventional computer-aided tool, a user opens software and reads aloud a text (pre-selected or randomly selected) displayed by the software. The computer analyzes the user's recorded audio and identifies errors. The computer may, for example, analyze how close the speech is to a desired pronunciation, or may utilize a speech recognition component to convert the speech input to text and then measure how close the converted text is to the original text.
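The text-comparison step described above can be illustrated with a minimal sketch. It assumes that closeness is measured as a word-level edit distance between the displayed text and the recognized text, normalized by the length of the displayed text (a word error rate). This metric and the function names word_edit_distance and word_error_rate are illustrative assumptions and are not taken from any particular conventional tool.

```python
def word_edit_distance(reference: str, hypothesis: str) -> int:
    """Count word insertions, deletions, and substitutions needed to
    turn the reference word sequence into the hypothesis word sequence."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # prev[j] is the distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of words in the reference."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference text must contain at least one word")
    return word_edit_distance(reference, hypothesis) / ref_len
```

In this sketch, a lower rate would indicate that the recognized text is closer to the text the user was asked to read. Note that such a comparison only works against a known reference text.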
Such computer-aided tools, however, do not provide a personal touch. Further, it is difficult for such a tool to reflect the user's actual, real-life speech content. Additionally, a user typically still needs to dedicate considerable time to engaging with the tool.
Speech recognition components of conventional tools are pre-trained, and thus highly impersonal. Indeed, conventional computer-aided tools cannot dynamically adapt to the content of the user's speech or of the user's conversations with others.
Conventional approaches also require active practicing. Pre-selected text may not correspond to the words and phrases most frequently spoken by the user. With conventional techniques, it can be difficult to cover certain words and phrases habitually spoken by the user, for example, technical terms.